I.  INTRODUCTION

To a pure mathematician, the zeros of the characteristic polynomial for a matrix are known as characteristic roots. To applied mathematicians, engineers, physicists, etc., they are known as eigenvalues, secular values, latent roots, or proper values. Whatever field of natural sciences one may be in, the characteristic roots of matrices play an important role.

Given a square matrix $A = (a_{k\lambda})$ of order n, where the $a_{k\lambda}$ are elements of a field, any characteristic root w of A is a solution to the characteristic equation

[I,1]  $\det(A - wI) = 0.$

Associated with any nonzero characteristic root w is a nonzero characteristic vector $X = (x_1, x_2, \ldots, x_n)'$ which is a non-trivial solution to the system of linear equations

$AX = wX,$

or, equivalently, the system

[I,2]  $\sum_{\lambda=1}^{n} a_{k\lambda} x_\lambda = w x_k, \quad k = 1, 2, \ldots, n.$

Because of the importance of this system of linear equations, it will be referred to as [I,2] whenever it is used throughout this paper. In using this system of equations, we will sometimes be interested only in the two or occasionally three equations which involve the largest values of the $|x_i|$, say $|x_m|$ and $|x_p|$. However, when this is being done, it will be stated at that time.
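As a small numerical illustration of system [I,2], the following Python sketch checks the equations component by component for a known eigenpair and then picks out the indices of the two components of largest absolute value. The 3 x 3 matrix, the root w = 3 and the vector X = (4, 2, -1)' are illustrative choices and do not come from the paper.

```python
# Hypothetical example data (not from the paper): A X = 3 X.
A = [[1, 4, 0],
     [0, 3, 0],
     [0, -1, 1]]
w = 3
X = [4, 2, -1]
n = len(A)

# Residual of each equation of [I,2]: sum_l a_kl x_l - w x_k.
residuals = [sum(A[k][l] * X[l] for l in range(n)) - w * X[k] for k in range(n)]

# Indices of the two largest components in absolute value.
order = sorted(range(n), key=lambda i: abs(X[i]), reverse=True)
m, p = order[0], order[1]

print(residuals, m, p)
```

Only the m-th and p-th equations of the system would then be used in arguments of the kind described above.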


It  is  the  objective  of  this  paper  to  organize  some  of  the 
existing  bounds  for  the  characteristic  roots  of  several  types  of 
matrices.   In  order  to  keep  this  paper  to  a  reasonable  length, 
and  yet  retain  the  continuity  of  the  presentation,  it  will  be 
necessary  to  state  without  proof  some  of  the  first-established 
bounds  and  also  a  few  basic  theorems  on  inequalities . 

The  two  types  of  matrices  to  be  considered  are,  first  of  all, 
matrices  with  arbitrary  real  or  complex  elements,  and  secondly, 
stochastic  and  generalized  stochastic  matrices . 

Some  of  the  bounds  will  be  easily  interpreted  geometrically 
while  others  are  either  primarily  theoretical  or  else  give  rise 
to  another  bound  which  is  sharper  and  easier  to  apply. 

Some bounds for the characteristic roots of an arbitrary square matrix of order n have been known for a long time. The first results that specifically gave bounds for the characteristic roots of a real matrix were due to Ivar Bendixson and are dated 1900. Then, shortly after 1900, A. Hirsch proved a theorem which established the first bound obtained for the characteristic roots of an arbitrary matrix with real or complex elements. Some of these first-established bounds will be given in this paper as a foundation for the development of some of the more recently obtained bounds.

Throughout  this  paper,  the  terms  circle,  disc  and  oval  are 
used  to  mean  both  the  closed  curve  and  its  interior.   Thus, 
inequalities  are  used  rather  than  equalities.   However,  the 
phrase  "in  or  on"  is  used  to  prevent  misunderstanding. 


II.   BACKGROUND  AND  BASIC  BOUNDS 

Let $A = (a_{k\lambda})$ be an arbitrary square matrix of order n. A. Hirsch proved the following theorem [11; 1.3.1]:

Theorem (II.1). Any characteristic root w of A satisfies the inequality

$|w| \le n \max_{k,\lambda} |a_{k\lambda}|.$

Later, I. Schur proved the following theorem [11; 1.4.1]:

Theorem (II.2). If $w_v$ denotes the v-th characteristic root of an arbitrary square matrix $A = (a_{k\lambda})$ of order n, then

$\sum_{v=1}^{n} |w_v|^2 \le \sum_{k,\lambda=1}^{n} |a_{k\lambda}|^2.$


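Because Theorem (II.2) was arrived at after Theorem (II.1), it should contain a more precise bound. Indeed it does, since in Schur's theorem one considers all $|a_{k\lambda}|$, not just $\max |a_{k\lambda}|$. Explicitly, $|w_v|^2 \le \sum_{k,\lambda} |a_{k\lambda}|^2 \le n^2 \max_{k,\lambda} |a_{k\lambda}|^2$, which recovers the bound of Theorem (II.1).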
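The comparison between the two theorems can be made concrete with a short Python sketch. It evaluates both bounds for the 3 x 3 complex matrix that the paper displays at the end of this section; the variable names are chosen here for illustration only.

```python
import math

# The 3 x 3 matrix displayed at the end of Section II.
A = [[1 + 1j, 2 - 3j, 1 - 2j],
     [3, -2 + 1j, 1],
     [2 - 1j, -1, 1]]
n = len(A)

# Theorem (II.1): |w| <= n * max |a_kl|.
hirsch = n * max(abs(a) for row in A for a in row)

# Theorem (II.2) implies |w| <= sqrt(sum of |a_kl|^2) for every root w.
schur = math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

print(round(hirsch, 4), round(schur, 4))  # approximately 10.8167 and 6.4807
```

For this matrix the largest modulus is $|2 - 3i| = \sqrt{13}$ and the sum of the squared moduli is 42, so the Schur-type bound is noticeably smaller.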
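For any arbitrary square matrix of order n with real or complex coefficients, we define

$\sum_{\lambda=1}^{n} |a_{k\lambda}| = S_k, \quad k = 1, 2, \ldots, n,$  and  $\sum_{k=1}^{n} |a_{k\lambda}| = T_\lambda, \quad \lambda = 1, 2, \ldots, n,$

and call these the k-th row sum and the λ-th column sum respectively.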

Theorem (II.3). For any non-zero characteristic root w of A,

$|w|^2 \le \max_k (S_k T_k).$

For the proof, we shall consider two cases. For the first case, we assume that $S_k \ne 0$ for all k. After proving the theorem for $S_k \ne 0$, we then assume that $S_k = 0$ for some k, which actually means that the elements of the k-th row are zeros. We shall then consider the characteristic roots of a matrix similar to A in which the n-th row is the row of zeros. This will then imply that the n-th component of the characteristic vector associated with the root must be zero.

Proof. Assume $S_k \ne 0$ for all k. Recall [I,2], that the basic system of linear equations for any characteristic root w is

$w x_k = \sum_{\lambda=1}^{n} a_{k\lambda} x_\lambda.$

By taking absolute values of both sides of this equality and applying the triangle inequality to the right side, we obtain

[II,1]  $|w||x_k| \le \sum_{\lambda=1}^{n} |a_{k\lambda}||x_\lambda| = \sum_{\lambda=1}^{n} |a_{k\lambda}|^{1/2} |a_{k\lambda}|^{1/2} |x_\lambda|.$

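If we square both sides of [II,1], the inequality becomes

[II,2]  $|w|^2 |x_k|^2 \le \left( \sum_{\lambda=1}^{n} |a_{k\lambda}|^{1/2} |a_{k\lambda}|^{1/2} |x_\lambda| \right)^2.$

Applying the Cauchy-Schwarz inequality in the form of

$\left( \sum_{\lambda=1}^{n} x_\lambda y_\lambda \right)^2 \le \left( \sum_{\lambda=1}^{n} x_\lambda^2 \right) \left( \sum_{\lambda=1}^{n} y_\lambda^2 \right)$

to the right side of [II,2], we obtain

$|w|^2 |x_k|^2 \le \left( \sum_{\lambda=1}^{n} |a_{k\lambda}| \right) \left( \sum_{\lambda=1}^{n} |a_{k\lambda}||x_\lambda|^2 \right)$

and hence,

[II,3]  $|w|^2 |x_k|^2 \le S_k \sum_{\lambda=1}^{n} |a_{k\lambda}||x_\lambda|^2.$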
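The Cauchy-Schwarz step can be spelled out as a worked equation. The auxiliary names $p_\lambda$ and $q_\lambda$ below are introduced only for this illustration and are not the paper's notation; k is held fixed.

```latex
% Choice of the two sequences (k fixed):
p_\lambda = |a_{k\lambda}|^{1/2}, \qquad
q_\lambda = |a_{k\lambda}|^{1/2}\,|x_\lambda| .
% Cauchy-Schwarz applied to the right side of [II,2]:
\Bigl( \sum_{\lambda=1}^{n} p_\lambda q_\lambda \Bigr)^{2}
  \le \Bigl( \sum_{\lambda=1}^{n} p_\lambda^{2} \Bigr)
      \Bigl( \sum_{\lambda=1}^{n} q_\lambda^{2} \Bigr)
  = \Bigl( \sum_{\lambda=1}^{n} |a_{k\lambda}| \Bigr)
    \Bigl( \sum_{\lambda=1}^{n} |a_{k\lambda}|\,|x_\lambda|^{2} \Bigr)
  = S_k \sum_{\lambda=1}^{n} |a_{k\lambda}|\,|x_\lambda|^{2} .
```

Splitting each $|a_{k\lambda}|$ into two square-root factors is what allows the row sum $S_k$ to appear as a separate factor.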

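Since $S_k \ne 0$ for all k, we can then divide the k-th inequality by $S_k$ to obtain

$\frac{|w|^2 |x_k|^2}{S_k} \le \sum_{\lambda=1}^{n} |a_{k\lambda}||x_\lambda|^2.$

Upon summing these over k, we will then introduce $T_\lambda$, obtaining

$\sum_{k=1}^{n} \frac{|w|^2 |x_k|^2}{S_k} \le \sum_{\lambda=1}^{n} \sum_{k=1}^{n} |a_{k\lambda}||x_\lambda|^2 = \sum_{\lambda=1}^{n} T_\lambda |x_\lambda|^2.$

Replacing the summation on λ by a summation on k, we have

$\sum_{k=1}^{n} \frac{|w|^2 |x_k|^2}{S_k} \le \sum_{k=1}^{n} T_k |x_k|^2.$

Since our summations are both over k, we can subtract the right side of the inequality from the left side, and combine the summations into one, which leaves

$\sum_{k=1}^{n} \left( \frac{|w|^2}{S_k} - T_k \right) |x_k|^2 \le 0.$

Since $|x_k|^2 \ge 0$ and not all of the $x_k$ are zero, there must be at least one value of k, say d, such that

$\frac{|w|^2}{S_d} - T_d \le 0$

and thus

$|w|^2 \le S_d T_d \le \max_k (S_k T_k)$  if  $S_k \ne 0$, k = 1, 2, ..., n.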


Assume that $S_k = 0$ for one value of k, say k = i. Consider the matrix B obtained from A by a permutation of the i-th and n-th rows and columns. Since this permutation is a similarity transformation, B will have the same characteristic roots as A. With the assumption that $w \ne 0$, we must have the n-th component of the characteristic vector corresponding to w equal to zero. Thus, in inequality [II,3], we need to sum only to n-1, obtaining n-1 inequalities of the form:

$|w|^2 |x_k|^2 \le S_k \sum_{\lambda=1}^{n-1} |a_{k\lambda}||x_\lambda|^2, \quad k = 1, 2, \ldots, n-1.$

Recall that $S_n = 0$ really means that $|a_{nk}| = 0$ for all k. Thus, for the matrix B, we may suppose $S_k \ne 0$, k = 1, 2, ..., n-1. We use the same argument as above to obtain

$|w|^2 \le \max_k \left( S_k \sum_{\lambda=1}^{n-1} |a_{\lambda k}| \right) = \max_k (S_k T_k).$

If $S_i = 0$ and $S_j = 0$, we again consider the similarity transformations such that $S_n = 0$ and $S_{n-1} = 0$ and apply the same reasoning as above. Hence, by continuing this process, the theorem is proved.

Prior to the establishment of the bound in this theorem, several other bounds were known which turn out to be special cases of this theorem. W. V. Parker stated and proved the following result [11; p. 144].


Theorem (II.4). If we let $S = \max_k \frac{S_k + T_k}{2}$, then

$|w| \le S.$

This follows directly by considering

$|w| \le \max_k (S_k T_k)^{1/2} \le \max_k \frac{S_k + T_k}{2} = S.$

The first inequality is a direct consequence of Theorem (II.3), while the second is true because the geometric mean is less than or equal to the arithmetic mean for non-negative numbers.

A.  B.  Farnell  stated  and  proved  the  following  important 
theorem  [11;  p.  144]. 

Theorem (II.5). If $S = \max_k S_k$ and $T = \max_k T_k$, then

$|w|^2 \le ST.$

This is a consequence of Theorem (II.3) since the maximum of a product is less than or equal to the product of the maxima. Although this bound is somewhat weaker than the bound from Theorem (II.3), it is, however, easier to apply.
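The three bounds of Theorems (II.3), (II.4) and (II.5) can be compared numerically. The sketch below is only an illustration, again using the 3 x 3 matrix displayed at the end of this section; the names `bound_II3`, `bound_II4` and `bound_II5` are chosen here and are not the paper's.

```python
import math

A = [[1 + 1j, 2 - 3j, 1 - 2j],
     [3, -2 + 1j, 1],
     [2 - 1j, -1, 1]]
n = len(A)

# Row sums S_k and column sums T_k of the absolute values.
S = [sum(abs(A[k][l]) for l in range(n)) for k in range(n)]
T = [sum(abs(A[k][l]) for k in range(n)) for l in range(n)]

bound_II3 = math.sqrt(max(S[k] * T[k] for k in range(n)))  # Theorem (II.3)
bound_II4 = max((S[k] + T[k]) / 2 for k in range(n))       # Theorem (II.4), Parker
bound_II5 = math.sqrt(max(S) * max(T))                     # Theorem (II.5), Farnell

print(bound_II3, bound_II4, bound_II5)
```

Both derived bounds are at least as large as the bound of Theorem (II.3), in line with the geometric-arithmetic mean and product-of-maxima arguments above.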

Several of the circular bounds given thus far are illustrated in graph #1 for the matrix

$A = \begin{pmatrix} 1+i & 2-3i & 1-2i \\ 3 & -2+i & 1 \\ 2-i & -1 & 1 \end{pmatrix}.$

GRAPH  #1 
III.   BOUNDS  FOR  THE  CHARACTERISTIC  ROOTS  OF  AN  ARBITRARY  SQUARE  MATRIX

Let $A = (a_{k\lambda})$ be a square matrix with real or complex elements. Introduce

$\sum_{\lambda=1,\ \lambda \ne k}^{n} |a_{k\lambda}| = P_k, \quad k = 1, 2, \ldots, n.$

Call this the k-th row radius.

Theorem (III.1). Each characteristic root w of A lies in the interior or on the boundary of at least one of the n circles

$|z - a_{kk}| \le P_k, \quad k = 1, 2, \ldots, n.$

Proof. For every non-zero characteristic root w, the system of linear equations, [I,2],

$\sum_{\lambda=1}^{n} a_{k\lambda} x_\lambda = w x_k, \quad k = 1, 2, \ldots, n,$

has a non-trivial solution $(x_1, x_2, \ldots, x_n)$. Assume that $|x_v| \ge |x_j|$, j = 1, 2, ..., n, $j \ne v$. Then, consider the v-th equation of this system:

$\sum_{\lambda=1}^{n} a_{v\lambda} x_\lambda = w x_v,$

or equivalently, by transposing $a_{vv} x_v$ to the right side of the equation,

$\sum_{\lambda=1,\ \lambda \ne v}^{n} a_{v\lambda} x_\lambda = (w - a_{vv}) x_v.$


Taking absolute values of both sides of the equation, and applying the triangle inequality, we have

$|w - a_{vv}||x_v| = \left| \sum_{\lambda \ne v} a_{v\lambda} x_\lambda \right| \le \sum_{\lambda \ne v} |a_{v\lambda}||x_\lambda| \le \sum_{\lambda \ne v} |a_{v\lambda}||x_v|.$

The last inequality follows since $|x_v| \ge |x_j|$ for all $j \ne v$. Since $(x_1, x_2, \ldots, x_n)$ was a non-trivial solution, $|x_v| \ne 0$. Thus

$|w - a_{vv}| \le \sum_{\lambda=1,\ \lambda \ne v}^{n} |a_{v\lambda}| = P_v.$

Hence w lies in the interior or on the boundary of at least one of the n circles $|z - a_{kk}| \le P_k$, k = 1, 2, ..., n, and the theorem is proved. These closed circles are called Gersgorin discs. Analogously, if we define

$\sum_{k=1,\ k \ne \lambda}^{n} |a_{k\lambda}| = Q_\lambda, \quad \lambda = 1, 2, \ldots, n,$

and call this the λ-th column radius, then each characteristic root w of A lies in the interior or on the boundary of at least one of the n circles

$|z - a_{\lambda\lambda}| \le Q_\lambda, \quad \lambda = 1, 2, \ldots, n.$
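As a numerical check of the Gersgorin discs, the sketch below computes the row and column radii of the 3 x 3 matrix displayed in section II and tests whether each characteristic root lies in at least one disc of each family. The roots are approximated here with a Durand-Kerner iteration on the characteristic polynomial; that root finder, its starting values and the tolerance are choices made for this illustration, not a method used in the paper.

```python
A = [[1 + 1j, 2 - 3j, 1 - 2j],
     [3 + 0j, -2 + 1j, 1 + 0j],
     [2 - 1j, -1 + 0j, 1 + 0j]]
n = len(A)

def char_poly_3x3(a):
    """Return (c2, c1, c0) with det(zI - a) = z^3 + c2*z^2 + c1*z + c0."""
    trace = a[0][0] + a[1][1] + a[2][2]
    minors = (a[0][0] * a[1][1] - a[0][1] * a[1][0]
              + a[0][0] * a[2][2] - a[0][2] * a[2][0]
              + a[1][1] * a[2][2] - a[1][2] * a[2][1])
    det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
           - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
           + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    return -trace, minors, -det

def cubic_roots(c2, c1, c0, iterations=500):
    """Durand-Kerner iteration for the monic cubic z^3 + c2 z^2 + c1 z + c0."""
    roots = [(0.4 + 0.9j) ** k for k in range(3)]
    for _ in range(iterations):
        for i in range(3):
            value = ((roots[i] + c2) * roots[i] + c1) * roots[i] + c0
            denom = 1
            for j in range(3):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots

# Row radii P_k and column radii Q_l (diagonal entry excluded).
P = [sum(abs(A[k][l]) for l in range(n) if l != k) for k in range(n)]
Q = [sum(abs(A[k][l]) for k in range(n) if k != l) for l in range(n)]

roots = cubic_roots(*char_poly_3x3(A))
tol = 1e-9  # slack for rounding error in the approximate roots
in_row_disc = [any(abs(w - A[k][k]) <= P[k] + tol for k in range(n)) for w in roots]
in_col_disc = [any(abs(w - A[l][l]) <= Q[l] + tol for l in range(n)) for w in roots]

print(P, Q, in_row_disc, in_col_disc)
```

Each entry of `in_row_disc` and `in_col_disc` should be true if the roots have converged, since every root must lie in some row disc and in some column disc.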

We  now  use  this  result  to  prove  the  following: 

Theorem (III.2). The absolute value of each characteristic root is less than or equal to min(S, T), remembering that $S = \max_k S_k$ and $T = \max_k T_k$.

Proof. From Theorem (III.1), we have the Gersgorin discs

$|z - a_{kk}| \le P_k$  and  $|z - a_{\lambda\lambda}| \le Q_\lambda.$

Thus, for any characteristic root w, there are some k and some λ for which

$|w - a_{kk}| \le P_k$  and  $|w - a_{\lambda\lambda}| \le Q_\lambda.$

If we apply the basic inequality that

$|w - a_{kk}| \ge |w| - |a_{kk}|,$

we obtain

$|w| - |a_{kk}| \le P_k$  and  $|w| - |a_{\lambda\lambda}| \le Q_\lambda.$

By transposing, we complete the proof with

$|w| \le P_k + |a_{kk}| = S_k$  and  $|w| \le Q_\lambda + |a_{\lambda\lambda}| = T_\lambda.$

Since $S_k \le S$ and $T_\lambda \le T$ whatever k and λ are, we obtain the result

$|w| \le \min(S, T).$
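The bound min(S, T) is easy to evaluate directly. The following sketch, offered only as an illustration, builds the row and column sums from the radii through $S_k = |a_{kk}| + P_k$ and $T_\lambda = |a_{\lambda\lambda}| + Q_\lambda$ for the matrix displayed in section II.

```python
import math

A = [[1 + 1j, 2 - 3j, 1 - 2j],
     [3, -2 + 1j, 1],
     [2 - 1j, -1, 1]]
n = len(A)

P = [sum(abs(A[k][l]) for l in range(n) if l != k) for k in range(n)]  # row radii
Q = [sum(abs(A[k][l]) for k in range(n) if k != l) for l in range(n)]  # column radii

S = [abs(A[k][k]) + P[k] for k in range(n)]  # row sums S_k
T = [abs(A[l][l]) + Q[l] for l in range(n)]  # column sums T_l

bound = min(max(S), max(T))
print(max(S), max(T), bound)
```

For this matrix the largest column sum, $1 + \sqrt{13} + \sqrt{5}$, is smaller than the largest row sum, so the column sums determine the bound.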

Since $|a_{kk}| + P_k = S_k$, we have that all of the Gersgorin discs for the row radii lie within the circle $|z| \le S$. Hence, if $|a_{kk}|$ is the minimum of the diagonal elements, then the Gersgorin disc centered at this $a_{kk}$ will be tangent to the circle $|z| \le S$, provided that $S_k = S$ for this row. If, in addition, $|a_{kk}| = 0$, then this Gersgorin disc will be the circle $|z| \le S$.

A better bound than either the Gersgorin discs or the disc from Theorem (III.2) is presented in the next theorem.

The Gersgorin discs for both the row radii and the column radii for the matrix displayed in section II are illustrated in graph #2. The shaded portion indicates the area contained in the union of the six Gersgorin discs, but not their intersection. The dash-constructed circles are for the row radii and the solid-constructed circles are for the column radii. The circle with center at (1, 0) is for both row and column radii. Since all of the characteristic roots of A must lie both in the union of the three discs using row radii and also in the union of the three discs using column radii, they must all lie in the intersection of these two unions. Graphically, this is the area enclosed by the shaded portion.

The largest circle, centered at the origin, is the bound from Theorem (III.2). Note that it encloses the intersection of the two unions of Gersgorin discs, but not the union of all six discs.

Theorem (III.3). Each characteristic root w of $A = (a_{k\lambda})$ lies in or on at least one of the $\binom{n}{2} = \frac{n(n-1)}{2}$ ovals of Cassini

$|z - a_{kk}||z - a_{\lambda\lambda}| \le P_k P_\lambda, \quad k, \lambda = 1, 2, \ldots, n, \quad k \ne \lambda.$

These ovals are a specialization of the generalized lemniscates. For a detailed geometric interpretation of these ovals, see [2, f].

Proof. As in Theorem (III.1), there will exist a non-zero characteristic vector $(x_1, x_2, \ldots, x_n)'$ corresponding to each non-zero characteristic root w which will be a non-trivial solution to the system of linear equations, [I,2],

$\sum_{\lambda=1}^{n} a_{k\lambda} x_\lambda = w x_k, \quad k = 1, 2, \ldots, n.$

Let $x_m$ and $x_v$ be the two largest components in absolute value of $(x_1, x_2, \ldots, x_n)$, with

$|x_m| \ge |x_v| \ge |x_i|, \quad i = 1, 2, \ldots, n, \quad i \ne m, v.$

Consider now the m-th and the v-th equations of this system:

$\sum_{\lambda=1}^{n} a_{m\lambda} x_\lambda = w x_m$  and  $\sum_{\lambda=1}^{n} a_{v\lambda} x_\lambda = w x_v.$

By transposing $a_{mm} x_m$ and $a_{vv} x_v$ respectively, these equations can equivalently be written as

$\sum_{\lambda=1,\ \lambda \ne m}^{n} a_{m\lambda} x_\lambda = (w - a_{mm}) x_m$  and  $\sum_{\lambda=1,\ \lambda \ne v}^{n} a_{v\lambda} x_\lambda = (w - a_{vv}) x_v.$

Suppose $x_v = 0$. Then $x_i = 0$ for all $i \ne m$. Since $w \ne 0$, the $x_i$ in the characteristic vector $(x_1, x_2, \ldots, x_n)'$ corresponding to w cannot all be zero. Hence, $x_m \ne 0$. Thus, the m-th equation becomes

$0 = (w - a_{mm}) x_m = w x_m - a_{mm} x_m.$

By transposing $a_{mm} x_m$ and then dividing both sides by the non-zero $x_m$, we get $w = a_{mm}$. Thus, w is trivially in the oval

$|z - a_{mm}||z - a_{vv}| \le P_m P_v.$

If $x_v \ne 0$, we multiply both sides of the m-th equation by the corresponding sides of the v-th equation, to obtain

$(w - a_{mm})(w - a_{vv}) x_m x_v = \sum_{\lambda=1,\ \lambda \ne m}^{n} a_{m\lambda} x_\lambda \sum_{\lambda=1,\ \lambda \ne v}^{n} a_{v\lambda} x_\lambda.$

Upon taking absolute values of both sides, applying the triangle inequality to each summation, and factoring out $|x_v|$ and $|x_m|$, the largest of the $|x_\lambda|$ occurring in the first and in the second summation respectively, we introduce the $P_k$ as desired to obtain

[III,1]  $|w - a_{mm}||w - a_{vv}||x_m x_v| \le |x_v| \sum_{\lambda \ne m} |a_{m\lambda}| \cdot |x_m| \sum_{\lambda \ne v} |a_{v\lambda}| = |x_m| P_m |x_v| P_v.$

The last equality is by the definition of the $P_k$, and hence, by dividing through by $|x_m||x_v|$, the inequality becomes

[III,2]  $|w - a_{mm}||w - a_{vv}| \le P_m P_v.$

Since $|x_m| \ne 0$ and $|x_v| \ne 0$, this division is permissible. Thus w lies in or on the oval

$|z - a_{mm}||z - a_{vv}| \le P_m P_v$

and the theorem is proved.

If the column radii are used instead of the row radii, we get a similar result. In this case, the statement is that w lies in or on at least one of the $\frac{n(n-1)}{2}$ ovals of Cassini

$|z - a_{kk}||z - a_{\lambda\lambda}| \le Q_k Q_\lambda, \quad k, \lambda = 1, 2, \ldots, n, \quad k \ne \lambda.$

Any point z which lies in the oval

$|z - a_{kk}||z - a_{\lambda\lambda}| \le Q_k Q_\lambda$

must satisfy $|z - a_{kk}| \le Q_k$ or $|z - a_{\lambda\lambda}| \le Q_\lambda$, since otherwise the product on the left would exceed $Q_k Q_\lambda$. Hence, the above oval lies within the union of the two Gersgorin discs

$|z - a_{kk}| \le Q_k$  and  $|z - a_{\lambda\lambda}| \le Q_\lambda,$

and this is indeed a better result. The same holds for the row radii with $P_k$ and $P_\lambda$ in place of $Q_k$ and $Q_\lambda$. If k = λ, these ovals become the Gersgorin discs.
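That each oval of Cassini lies within the union of its two Gersgorin discs can be checked by brute force on a grid of points. The sketch below does this for the column radii of the matrix displayed in section II; the grid spacing and extent are arbitrary choices made for this illustration.

```python
from itertools import combinations

A = [[1 + 1j, 2 - 3j, 1 - 2j],
     [3, -2 + 1j, 1],
     [2 - 1j, -1, 1]]
n = len(A)
Q = [sum(abs(A[k][l]) for k in range(n) if k != l) for l in range(n)]  # column radii

def in_oval(z, k, l):
    """True if z lies in or on the Cassini oval for the index pair (k, l)."""
    return abs(z - A[k][k]) * abs(z - A[l][l]) <= Q[k] * Q[l]

def in_disc(z, k):
    """True if z lies in or on the Gersgorin disc for column k."""
    return abs(z - A[k][k]) <= Q[k]

# Grid of sample points with spacing 1/4 over the square [-10, 10] x [-10, 10].
grid = [complex(x / 4, y / 4) for x in range(-40, 41) for y in range(-40, 41)]

inside = 0       # sample points found inside some oval
violations = 0   # points inside an oval but outside both of its discs
for k, l in combinations(range(n), 2):
    for z in grid:
        if in_oval(z, k, l):
            inside += 1
            if not (in_disc(z, k) or in_disc(z, l)):
                violations += 1

print(inside, violations)
```

No violations should be found, because a point outside both discs makes the product of the two distances exceed $Q_k Q_\lambda$.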

These ovals of Cassini are somewhat difficult to construct because, to find any point on the curve, one must solve a fourth degree polynomial equation in x and y. However, these ovals are symmetric about the line joining their "foci", that is, the line joining the two diagonal entries being considered. They are also symmetric about the perpendicular bisector of the segment joining their "foci". Hence, four points on the oval are quite easily found.

For the matrix displayed in section II, the ovals using the row radii are illustrated in graph #3. For the ovals

$|z - (1 + i)||z - (-2 + i)| \le 4(\sqrt{13} + \sqrt{5})$

and

$|z - (1 + i)||z - 1| \le (\sqrt{13} + \sqrt{5})(\sqrt{5} + 1),$

the four points on each curve are easy to find because their lines of symmetry are parallel to the x-axis and the y-axis. The equations for these lines, first for the lines joining their "foci", and then for the perpendicular bisectors of the segments joining their "foci" are, for the respective ovals,

y = 1  and  x = -1/2

and

x = 1  and  y = 1/2.

However, for the oval

$|z - (-2 + i)||z - 1| \le 4(\sqrt{5} + 1),$

the line joining their "foci" has the equation

x + 3y = 1,

and the perpendicular bisector of this segment joining (-2, 1) and (1, 0) has the equation

-3x + y = 2.

Hence, to find the four points on the curve, we must solve the two systems of equations

$(x^2 + 4x + y^2 - 2y + 5)^{1/2} (x^2 - 2x + y^2 + 1)^{1/2} = 4(\sqrt{5} + 1)$
$x + 3y = 1$

and

$(x^2 + 4x + y^2 - 2y + 5)^{1/2} (x^2 - 2x + y^2 + 1)^{1/2} = 4(\sqrt{5} + 1)$
$-3x + y = 2.$

The solutions to these two pairs of equations were obtained on the IBM 360 computer and they are given on the graph to four decimal places.
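The two systems can also be solved in closed form by exploiting the symmetry of the oval. The sketch below is not the computation that was run on the IBM 360, and its output is not claimed to reproduce the values printed on the graph. It uses the fact that, with e denoting half the distance between the "foci" and c the right side of the oval equation, a point at distance t from the midpoint along the focal line satisfies $t^2 - e^2 = c$, while a point at distance s along the perpendicular bisector satisfies $s^2 + e^2 = c$. The names `f1`, `f2`, `c`, `e`, `t` and `s` are introduced here for illustration.

```python
import math

f1, f2 = complex(-2, 1), complex(1, 0)   # the two "foci" a_22 and a_33
c = 4 * (math.sqrt(5) + 1)               # right side P_2 * P_3 of the oval

mid = (f1 + f2) / 2                      # midpoint of the focal segment
e = abs(f2 - f1) / 2                     # half the focal distance
u = (f2 - f1) / abs(f2 - f1)             # unit vector along the focal line

t = math.sqrt(c + e * e)                 # focal line: (t + e)(t - e) = c
s = math.sqrt(c - e * e)                 # bisector: s^2 + e^2 = c (needs c > e^2)

on_focal_line = [mid + t * u, mid - t * u]
on_bisector = [mid + s * 1j * u, mid - s * 1j * u]

for z in on_focal_line + on_bisector:
    print(round(z.real, 4), round(z.imag, 4))
```

Multiplying the unit vector by i rotates it through a right angle, which gives the direction of the perpendicular bisector.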

As  in  the  graph  of  the  Gersgorin  discs,  the  intersection  of 
these  three  ovals  of  Cassini  lies  within  the  shaded  portion.   The 
reader  is  reminded  that  these  are  the  three  ovals  obtained  from 
the  rows ,  and  that  the  three  other  ovals  obtained  from  the  columns 
may  even  further  decrease  the  area  in  which  the  characteristic 
roots  for  this  particular  matrix  may  lie.   Since  both  graph  #2 
and  graph  #3  are  to  the  same  scale,  the  reader  may  hold  the  two 
graphs  to  the  light  and  see  that  these  ovals  do  indeed  lie 
interior  to  the  intersection  of  the  Gersgorin  discs . 

We  shall  now  show  that  these  bounds  given  by  the  ovals  of 
Cassini  can  be  improved.   These  improvements  will  be  derived 
mathematically,  and  then  shown,  indeed,  to  be  improvements.   The 
first  of  these  bounds ,  which  is  also  a  set  of  ovals  ,  is  contained 
in  the  next  theorem. 

GRAPH  #2

GRAPH  #3

Theorem (III.4). Let $A = (a_{k\lambda})$ be a square matrix of order n, and define

$P_{k\lambda} = |a_{k\lambda}| P_\lambda + |a_{\lambda k}| (P_k - |a_{k\lambda}|) + \sum_{v=1}^{n} |a_{kv} a_{\lambda v}| + \sum_{v < \mu} |a_{kv} a_{\lambda\mu} + a_{k\mu} a_{\lambda v}|,$

where $\mu \ne k, \lambda$; $v \ne k, \lambda$; $k \ne \lambda$.

Then each characteristic root of A lies in or on at least one of the ovals:

$|z - a_{kk}||z - a_{\lambda\lambda}| \le P_{k\lambda}.$

Proof. Let $(x_1, x_2, \ldots, x_n)$ be the non-trivial solution of the system of linear equations, [I,2], associated with w. Let

$|x_m| \ge |x_v| \ge |x_i|, \quad i = 1, 2, \ldots, n, \quad i \ne m, \quad i \ne v.$

As in the proof of Theorem (III.3), consider the m-th and v-th equations in the form:

$w x_m - a_{mm} x_m = \sum_{\lambda=1,\ \lambda \ne m}^{n} a_{m\lambda} x_\lambda$

and

$w x_v - a_{vv} x_v = \sum_{\lambda=1,\ \lambda \ne v}^{n} a_{v\lambda} x_\lambda.$

If $x_v = 0$, we have $w = a_{mm}$ as in Theorem (III.3), so that w is at one of the "foci" of all ovals formed using $a_{mm}$. Hence, the inequality is trivially satisfied.

If $x_v \ne 0$, we have, after multiplying the respective sides of the two equations together,

$(w - a_{mm})(w - a_{vv}) x_m x_v = \sum_{\lambda=1,\ \lambda \ne m}^{n} a_{m\lambda} x_\lambda \sum_{\mu=1,\ \mu \ne v}^{n} a_{v\mu} x_\mu,$

which we can write as

[III,3]  $(w - a_{mm})(w - a_{vv}) x_m x_v = a_{mv} x_v \sum_{\mu \ne v} a_{v\mu} x_\mu + a_{vm} x_m \sum_{\lambda \ne v, m} a_{m\lambda} x_\lambda + \sum_{\lambda \ne m, v} a_{m\lambda} a_{v\lambda} x_\lambda^2 + \sum_{\lambda < \mu;\ \lambda, \mu \ne m, v} (a_{m\lambda} a_{v\mu} + a_{m\mu} a_{v\lambda}) x_\lambda x_\mu.$

The various terms of the right side of equality [III,3] are obtained as follows:

Term 1. Remove $a_{mv} x_v$ from $\sum_{\lambda \ne m} a_{m\lambda} x_\lambda$ and multiply it by $\sum_{\mu \ne v} a_{v\mu} x_\mu$.

Term 2. Remove $a_{vm} x_m$ from $\sum_{\mu \ne v} a_{v\mu} x_\mu$ and multiply it by $\sum_{\lambda \ne v, m} a_{m\lambda} x_\lambda$.

Term 3. This is the sum of all products of two coefficients in the same position times the square of the $x_i$ in that position.

Term 4. This is the sum of two products, the first obtained by multiplying the element in the i-th position of $\sum_{\lambda} a_{m\lambda} x_\lambda$ by the element in the j-th position of $\sum_{\mu} a_{v\mu} x_\mu$ and the second by multiplying the element in the j-th position of $\sum_{\lambda} a_{m\lambda} x_\lambda$ by the element in the i-th position of $\sum_{\mu} a_{v\mu} x_\mu$ and summing all of these.

By taking absolute values of both sides of [III,3], and applying the triangle inequality, we obtain the following inequality:

[III,4]  $|w - a_{mm}||w - a_{vv}||x_m x_v| \le |a_{mv}||x_v| \sum_{\mu \ne v} |a_{v\mu}||x_\mu| + |a_{vm}||x_m| \sum_{\lambda \ne v, m} |a_{m\lambda}||x_\lambda| + \sum_{\lambda \ne m, v} |a_{m\lambda} a_{v\lambda}||x_\lambda|^2 + \sum_{\lambda < \mu;\ \lambda, \mu \ne m, v} |a_{m\lambda} a_{v\mu} + a_{m\mu} a_{v\lambda}||x_\lambda x_\mu|.$

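For this proof, we assumed that

$|x_m| \ge |x_v| \ge |x_i|, \quad i = 1, 2, \ldots, n.$

Thus, without loss of generality, we can normalize this vector on any $x_i \ne 0$, $i \ne m$, making this component unity. Hence, for the sums in the first two terms on the right side, the following inequalities are true:

$\sum_{\mu=1,\ \mu \ne v}^{n} |a_{v\mu}||x_\mu| \le \sum_{\mu=1,\ \mu \ne v}^{n} |a_{v\mu}|$  and  $\sum_{\lambda=1,\ \lambda \ne v, m}^{n} |a_{m\lambda}||x_\lambda| \le \sum_{\lambda=1,\ \lambda \ne v, m}^{n} |a_{m\lambda}|.$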
On using the notation previously described for row radii, [III,4] becomes

[III,5]  $|w - a_{mm}||w - a_{vv}||x_m x_v| \le |a_{mv}||x_v| P_v + |a_{vm}||x_m| (P_m - |a_{mv}|) + \sum_{\lambda \ne m, v} |a_{m\lambda} a_{v\lambda}||x_\lambda|^2 + \sum_{\lambda < \mu;\ \lambda, \mu \ne m, v} |a_{m\lambda} a_{v\mu} + a_{m\mu} a_{v\lambda}||x_\lambda x_\mu|.$

The above normalizing also yields the following set of inequalities for $\lambda, \mu \ne m$:

[III,6]  $|x_m x_v| \ge |x_v|, \quad |x_m x_v| \ge |x_m|, \quad |x_m x_v| \ge |x_\lambda|^2, \quad |x_m x_v| \ge |x_\lambda x_\mu|.$

Thus, upon dividing both sides of inequality [III,5] by $|x_m x_v|$, the inequality remains valid and can be written as

[III,7]  $|w - a_{mm}||w - a_{vv}| \le |a_{mv}| P_v + |a_{vm}| (P_m - |a_{mv}|) + \sum_{\lambda \ne m, v} |a_{m\lambda} a_{v\lambda}| + \sum_{\lambda < \mu;\ \lambda, \mu \ne m, v} |a_{m\lambda} a_{v\mu} + a_{m\mu} a_{v\lambda}| = P_{mv}.$

Hence w lies in or on at least one of the ovals

$|z - a_{mm}||z - a_{vv}| \le P_{mv},$

which proves the theorem.
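The quantity $P_{k\lambda}$ can be evaluated mechanically and compared with the product $P_k P_\lambda$ of Theorem (III.3). The following sketch does so for a hypothetical 4 x 4 integer matrix B, which is not taken from the paper, and for the 3 x 3 matrix of section II. The function names are chosen here for illustration, and the formula coded is the one stated in Theorem (III.4) as read from the text.

```python
from itertools import combinations

def row_radius(M, k):
    """P_k: sum of |M[k][l]| over l != k."""
    return sum(abs(M[k][l]) for l in range(len(M)) if l != k)

def pair_bound(M, k, l):
    """P_{k,l} of Theorem (III.4); indices v, u run over positions other than k, l."""
    others = [v for v in range(len(M)) if v not in (k, l)]
    total = abs(M[k][l]) * row_radius(M, l)
    total += abs(M[l][k]) * (row_radius(M, k) - abs(M[k][l]))
    total += sum(abs(M[k][v] * M[l][v]) for v in others)
    total += sum(abs(M[k][v] * M[l][u] + M[k][u] * M[l][v])
                 for v, u in combinations(others, 2))
    return total

# Hypothetical matrix with entries of mixed sign (illustration only).
B = [[4, 1, 2, -1],
     [1, -3, 1, 2],
     [0, 2, 5, 1],
     [1, 1, -2, 6]]

# Improvement P_k * P_l - P_{k,l} for every index pair of B.
gains = {(k, l): row_radius(B, k) * row_radius(B, l) - pair_bound(B, k, l)
         for k, l in combinations(range(len(B)), 2)}

# The 3 x 3 matrix of Section II.
A = [[1 + 1j, 2 - 3j, 1 - 2j],
     [3, -2 + 1j, 1],
     [2 - 1j, -1, 1]]
gain_A = row_radius(A, 0) * row_radius(A, 1) - pair_bound(A, 0, 1)

print(gains, gain_A)
```

For n = 3 the last sum in $P_{k\lambda}$ is empty, and expanding $P_k P_\lambda$ shows that the two bounds then coincide; a gain can only arise for $n \ge 4$, when terms inside that last sum partly cancel.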

To verify that $P_{mv} \le P_m P_v$, we note that in the proof of Theorem (III.3), involving $P_m P_v$, the last operation was to divide both sides of the inequality [III,1] by $|x_m x_v|$. However, this quantity appeared on both sides of the inequality, so it divided to unity. On the other hand, in the proof of Theorem (III.4), involving $P_{mv}$, when we divided both sides of inequality [III,5] by $|x_m x_v|$, we obtained unity on the left, but not on the right unless, in fact, the inequalities in [III,6] were actually all equalities. Hence, the ovals of Theorem (III.4) are at least as good as, and will in general be better than, the original ovals of Cassini obtained in Theorem (III.3).

One might ask whether or not these ovals have any practical advantage over the Gersgorin discs. First, note that since the Gersgorin discs are easily constructed, these should be the first bounds to consider when investigating the location of the characteristic roots of a given matrix. However, if the bounds obtained by using them are still not sufficiently accurate, we then resort to using the ovals of Cassini. As has been stated, the ovals lie within the union of the Gersgorin discs, and thus, for instance, if one questions whether or not a point near the boundary of a disc is an upper bound for a characteristic root, it may be confirmed by using the ovals.

Then too, it will not always be necessary to consider all of the $\frac{n(n-1)}{2}$ ovals. For example, if we know that a root lies within the circle $|z - a_{11}| \le P_1$, then we need only consider the (n - 1) ovals

$|z - a_{11}||z - a_{\lambda\lambda}| \le P_1 P_\lambda, \quad \lambda = 2, 3, \ldots, n.$

Furthermore, all of these may not need to be considered if sufficient bounds are obtained prior to the (n - 1)st oval.

If we specialize our matrix and require that $A = (a_{k\lambda})$ be a square matrix of order n with real elements, we can obtain the following theorem, which is very similar to Theorem (III.4). This theorem and its formal proof can be found in [2, e, p. 557].

Theorem (III.5). Let $A = (a_{k\lambda})$ be a square matrix of order n with real elements. For each given k and λ we denote the sum of the positive terms of

$S_{k\lambda} = \sum_{v=1,\ v \ne k, \lambda}^{n} a_{kv} a_{\lambda v}$

by $U_{k\lambda}$, and the sum of the negative terms of this $S_{k\lambda}$ by $V_{k\lambda}$, and denote $\max(U_{k\lambda}, |V_{k\lambda}|)$ by $m_{k\lambda}$. In a similar fashion as we did in Theorem (III.4), we set

$P^*_{k\lambda} = |a_{k\lambda}| P_\lambda + |a_{\lambda k}| (P_k - |a_{k\lambda}|) + m_{k\lambda} + \sum_{v < \mu} |a_{kv} a_{\lambda\mu} + a_{k\mu} a_{\lambda v}|.$

Then each real characteristic root w of A must lie in at least one of the closed intervals formed by the ovals

$|z - a_{kk}||z - a_{\lambda\lambda}| \le P^*_{k\lambda}$

and the real axis.

The proof is very similar to the proof of Theorem (III.4). The only difference between the two theorems is that the third term of $P_{k\lambda}$, namely,

$\sum_{v=1}^{n} |a_{kv} a_{\lambda v}|,$

is replaced by $m_{k\lambda} = \max(U_{k\lambda}, |V_{k\lambda}|)$ in $P^*_{k\lambda}$.

We see that $m_{k\lambda}$ is either the sum of the positive terms of $S_{k\lambda}$ or the sum of the absolute values of the negative terms of $S_{k\lambda}$, whichever is larger. Thus,

$m_{k\lambda} \le \sum_{v=1}^{n} |a_{kv} a_{\lambda v}|,$

and this gives us

$P^*_{k\lambda} \le P_{k\lambda}.$

Hence, Theorem (III.5) gives a sharper bound than Theorem (III.4) did, but we must remember that this latter theorem can be applied only to the real characteristic roots of a matrix with real elements.
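The inequality $m_{k\lambda} \le \sum_v |a_{kv} a_{\lambda v}|$ is easy to see on a small example. The sketch below uses a hypothetical real 4 x 4 matrix B that is not taken from the paper; the function name `m_and_abs_sum` is chosen here for illustration.

```python
def m_and_abs_sum(M, k, l):
    """Return (m_{k,l}, sum of |a_kv * a_lv|) over v other than k and l."""
    terms = [M[k][v] * M[l][v] for v in range(len(M)) if v not in (k, l)]
    U = sum(t for t in terms if t > 0)   # sum of the positive terms
    V = sum(t for t in terms if t < 0)   # sum of the negative terms
    return max(U, abs(V)), sum(abs(t) for t in terms)

# Hypothetical real matrix with mixed signs (illustration only).
B = [[4, 1, 2, -1],
     [1, -3, 1, 2],
     [0, 2, 5, 1],
     [1, 1, -2, 6]]

m_01, abs_sum_01 = m_and_abs_sum(B, 0, 1)
print(m_01, abs_sum_01)  # 2 and 4 for rows 0 and 1 of B
```

Here the products for rows 0 and 1 are 2 and -2, so $m_{k\lambda} = 2$ replaces the value 4 that would enter $P_{k\lambda}$. The saving occurs only when the products have mixed signs.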
IV.   BOUNDS  FOR  THE  CHARACTERISTIC  ROOTS  OF  STOCHASTIC  AND  GENERALIZED  STOCHASTIC  MATRICES


All of the bounds obtained so far with the exception of the bound obtained in Theorem (III.5) have been for arbitrary square matrices of order n with real or complex elements. Suppose now that we restrict the matrices under consideration in different ways. For the following development, all elements of the matrix $A = (a_{k\lambda})$ are assumed to be non-negative.

We call a square matrix $A = (a_{k\lambda})$ of order n stochastic if

[IV,1]  $\sum_{\lambda=1}^{n} a_{k\lambda} = 1, \quad k = 1, 2, \ldots, n,$

and positive stochastic if $a_{k\lambda} \ne 0$ for all k and λ. This definition is extended by calling a square matrix $A = (a_{k\lambda})$ of order n generalized stochastic if

$\sum_{\lambda=1}^{n} a_{k\lambda} = g, \quad k = 1, 2, \ldots, n,$

where g is some constant, and positive generalized stochastic if $a_{k\lambda} \ne 0$ for all k and λ.

By looking at the defining system of linear equations for a
characteristic root, [1,2],

    $\sum_{\lambda=1}^{n} a_{k\lambda} x_\lambda = w x_k, \quad k = 1, 2, \ldots, n,$

we see that for a stochastic matrix, w = 1 is a characteristic
root and (1, 1, ..., 1)' is a characteristic vector corresponding
to w = 1.

For a stochastic matrix, all of the row sums are one.  Hence,
as pointed out in the comment following the proof of Theorem
(III.2), all of the characteristic roots of a stochastic matrix
must lie in or on the unit circle $|z| \le 1$.  Since $S = \max_k S_k = 1$,
the Gersgorin disc for row $\lambda$ coincides with the unit disc if and
only if $a_{\lambda\lambda} = 0$.  Otherwise, since $|a_{kk}| + P_k = S_k = 1$, the
Gersgorin discs will all lie within the unit circle, and they
will be tangent to it at exactly one point.  This is the point of
intersection of the line joining $a_{kk}$ and the origin with the unit
circle, namely z = 1.  This is also a consequence of the next
theorem to be proved, Theorem (IV.1), but the result is already
intuitively obvious.
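
The observations above can be checked on a small example. The sketch below, in plain Python, tests whether a matrix is stochastic, confirms that the vector of ones is mapped to itself, and lists the Gersgorin discs for the rows (center $a_{kk}$, radius $P_k$). The helper names and the 3 x 3 matrix are hypothetical choices made for illustration.

```python
def is_stochastic(a, tol=1e-12):
    """Non-negative entries and every row sum equal to 1 (within tol)."""
    return (all(x >= 0 for row in a for x in row)
            and all(abs(sum(row) - 1.0) <= tol for row in a))


def mat_vec(a, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a_kl * x_l for a_kl, x_l in zip(row, x)) for row in a]


def gersgorin_row_discs(a):
    """(center, radius) of each row disc: a_kk and P_k = S_k - |a_kk|."""
    return [(row[k], sum(abs(x) for x in row) - abs(row[k]))
            for k, row in enumerate(a)]


# Hypothetical positive stochastic matrix of order 3.
A = [[0.1, 0.8, 0.1],
     [0.1, 0.2, 0.7],
     [0.6, 0.1, 0.3]]

print(is_stochastic(A))              # True
print(mat_vec(A, [1.0, 1.0, 1.0]))   # each component is a row sum, about 1
for center, radius in gersgorin_row_discs(A):
    # center + radius is about 1, so each disc touches the unit circle at z = 1
    print(center, radius, center + radius)
```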

Another  carry-over  to  stochastic  matrices  is  the  following. 

Theorem (IV.1).  If $a_{kk} = \min_i a_{ii}$, i = 1, 2, ..., n, for
any stochastic matrix $A = (a_{k\lambda})$, then all the characteristic
roots lie in or on the circle

[IV,3]  $|z - a_{kk}| \le 1 - a_{kk}.$

Proof.  This reduces to the unit circle if $a_{kk} = 0$.

If no $a_{kk} = 0$, this becomes the circle with center at $a_{kk}$
with the largest possible radius, $1 - a_{kk}$, of any of the Gersgorin
discs, and will include all of the other discs with centers at
$a_{\lambda\lambda}$, $\lambda = 1, 2, \ldots, n$, $\lambda \ne k$, and with radii $1 - a_{\lambda\lambda}$
that are no larger, since
$|z - a_{kk}| \le |z - a_{\lambda\lambda}| + (a_{\lambda\lambda} - a_{kk}) \le 1 - a_{kk}$
for every z in such a disc.
Hence, all of the roots will lie in or on the circle [IV,3].
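
A numerical check of Theorem (IV.1) on one example may be helpful. The sketch below is restricted to order 3, where the roots can be found without a linear algebra library: one root is w = 1, and the other two follow from the trace and the determinant. The function names and the matrix are hypothetical; this is an illustration, not part of the proof.

```python
import cmath


def det3(a):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def roots_stochastic_3x3(a):
    """Characteristic roots of a 3 x 3 matrix whose row sums are all 1.

    One root is w = 1; the other two have sum trace - 1 and product det(A),
    so they solve z^2 - (trace - 1) z + det(A) = 0.
    """
    s = a[0][0] + a[1][1] + a[2][2] - 1.0
    p = det3(a)
    d = cmath.sqrt(s * s - 4.0 * p)
    return [complex(1.0), (s + d) / 2.0, (s - d) / 2.0]


# Hypothetical positive stochastic matrix of order 3.
A = [[0.1, 0.8, 0.1],
     [0.1, 0.2, 0.7],
     [0.6, 0.1, 0.3]]

a_min = min(A[i][i] for i in range(3))   # smallest diagonal element
roots = roots_stochastic_3x3(A)
tol = 1e-12                              # allowance for rounding on the boundary
for w in roots:
    # True means the root lies in or on the circle [IV,3]
    print(w, abs(w - a_min) <= 1.0 - a_min + tol)
```

For this matrix the two non-trivial roots form a complex conjugate pair well inside the circle, while w = 1 lies on its boundary.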

Instead of having n(n - 1)/2 ovals of Cassini to consider in
obtaining bounds for the characteristic roots, we shall prove
that only one oval is needed which will contain all of the roots
of a stochastic matrix.  The following two lemmas, stated without
proof, will be needed in the development.  For their proofs,
see [2, d, pp. 78-80].

Lemma (IV.1).  Let a, b, c and k be real numbers satisfying
a < c < k and b < c < k.  Then the oval

    $|z - b| |z - c| \le (k - b)(k - c)$

lies in the interior of the oval

    $|z - a| |z - b| \le (k - a)(k - b),$

and z = k is the only common point on the contours of both regions.

Lemma (IV.2).  Assume that $a_1 \le a_2 \le \ldots \le a_n < k$.  Each of
the ovals

[IV,4]  $|z - a_\rho| |z - a_\lambda| \le (k - a_\rho)(k - a_\lambda),$

$\rho, \lambda = 1, 2, \ldots, n$, $\rho < \lambda$, is either identical with the oval

[IV,5]  $|z - a_1| |z - a_2| \le (k - a_1)(k - a_2)$

or lies in the interior of the oval [IV,5].  The point z = k is
the only common point of the boundaries of two different
ovals [IV,4].

Assuming these lemmas, we are now in a position to apply
them to a stochastic matrix in obtaining an oval which will
contain all of the characteristic roots.

Theorem (IV.2).  Let $A = (a_{k\lambda})$ be a stochastic matrix of
order n.  Let

    $a_{rr} \le a_{tt} \le a_{ii}, \quad i = 1, 2, \ldots, n, \quad i \ne r, t.$

Then, all of the characteristic roots of A lie in or on the single
oval

[IV,6]  $|z - a_{rr}| |z - a_{tt}| \le (1 - a_{rr})(1 - a_{tt}).$

Proof.  The proof is a direct consequence of our previous
Theorem (III.3) on the ovals of Cassini combined with Lemma (IV.2).
From Theorem (III.3), we know that every characteristic root
of A lies in or on at least one of the n(n - 1)/2 ovals of Cassini

    $|z - a_{kk}| |z - a_{\lambda\lambda}| \le P_k P_\lambda, \quad k, \lambda = 1, 2, \ldots, n, \quad k \ne \lambda.$

But, for A a stochastic matrix,

    $P_k = (1 - a_{kk})$  and  $P_\lambda = (1 - a_{\lambda\lambda}).$

Hence, every characteristic root lies in or on at least one
of the n(n - 1)/2 ovals

[IV,7]  $|z - a_{kk}| |z - a_{\lambda\lambda}| \le (1 - a_{kk})(1 - a_{\lambda\lambda}),$

for $k, \lambda = 1, 2, \ldots, n$, $k \ne \lambda$.  If we apply Lemma (IV.2) to the
elements of A, with the constant k of the lemma equal to 1, we have

    $a_1 \le a_2 \le \ldots \le a_n < 1.$

Applying Lemma (IV.2) to the diagonal elements of A in a very
slightly modified form, we have

    $a_{rr} \le a_{tt} \le a_{ii} \le 1.$

Thus, since $a_{rr}$ and $a_{tt}$ are less than one, we have each of the
ovals of [IV,7] lying within the single oval of [IV,6].  Hence,
the theorem is proved.
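
The nesting of the ovals used in this proof can be illustrated numerically. The sketch below samples points on a grid and counts any sampled point that lies in one of the ovals [IV,7] but outside the oval [IV,6]. The diagonal values, the grid spacing and the helper name `in_oval` are assumptions chosen for illustration; a grid test of this kind can suggest the nesting but does not prove it.

```python
from itertools import combinations


def in_oval(z, a, b, rhs, tol=0.0):
    """True if |z - a| |z - b| <= rhs + tol."""
    return abs(z - a) * abs(z - b) <= rhs + tol


# Hypothetical diagonal of a stochastic matrix of order 4.
diag = [0.1, 0.2, 0.3, 0.5]
a_rr, a_tt = sorted(diag)[:2]            # the two smallest diagonal elements
outer_rhs = (1 - a_rr) * (1 - a_tt)      # right-hand side of [IV,6]

# Sample points z = x + iy on a square grid that covers the unit disc.
steps = [i / 20.0 for i in range(-24, 25)]
grid = [complex(x, y) for x in steps for y in steps]

violations = 0
for a_kk, a_ll in combinations(diag, 2):     # the n(n - 1)/2 ovals [IV,7]
    rhs = (1 - a_kk) * (1 - a_ll)
    for z in grid:
        # The small tolerance guards against rounding at the common
        # boundary point z = 1.
        if in_oval(z, a_kk, a_ll, rhs) and not in_oval(z, a_rr, a_tt, outer_rhs, 1e-12):
            violations += 1

print(violations)   # 0 when every sampled point of an oval [IV,7] lies in [IV,6]
```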

The  result  of  this  theorem  can  easily  be  carried  over  to  a 
generalized  stochastic  matrix. 

Theorem (IV.3).  If $A = (a_{k\lambda})$ is a generalized stochastic
matrix with row sum g, and if

    $a_{rr} \le a_{tt} \le a_{ii}, \quad i = 1, 2, \ldots, n, \quad i \ne r, t,$

then all of the characteristic roots of A lie in or on the single
oval

    $|z - a_{rr}| |z - a_{tt}| \le (g - a_{rr})(g - a_{tt}).$

The proof of this is similar to the proof of Theorem (IV.2) and
is therefore omitted.
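
Since the proof is omitted, the following LaTeX fragment sketches how the argument of Theorem (IV.2) would presumably carry over. It is a reconstruction under the standing assumption that all elements are non-negative, not a proof taken from the source.

```latex
% Row radii of a generalized stochastic matrix with non-negative elements:
P_k = S_k - |a_{kk}| = g - a_{kk}, \qquad k = 1, 2, \ldots, n.
% Theorem (III.3) then places every characteristic root in or on
% at least one of the ovals
|z - a_{kk}|\,|z - a_{\lambda\lambda}| \le (g - a_{kk})(g - a_{\lambda\lambda}),
\qquad k \ne \lambda,
% and Lemma (IV.2), applied with the constant k of the lemma equal to g,
% shows that each of these ovals lies in or on
|z - a_{rr}|\,|z - a_{tt}| \le (g - a_{rr})(g - a_{tt}).
```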

Recall  that  for  a  stochastic  matrix  w  =  1  is  a  trivial 
characteristic  root.   Similarly,  for  a  generalized  stochastic 
matrix  with  row  sum  g,  w  =  g  is  a  trivial  characteristic  root. 
Note that w = 1 will lie on the boundary of the oval

    $|z - a_{rr}| |z - a_{tt}| \le (1 - a_{rr})(1 - a_{tt}),$

where

    $a_{rr} \le a_{tt} \le a_{ii}, \quad i = 1, 2, \ldots, n, \quad i \ne r, t.$

Also, for the generalized stochastic matrix, w = g will lie on
the boundary of the oval

    $|z - a_{rr}| |z - a_{tt}| \le (g - a_{rr})(g - a_{tt}).$

But, what can be said about the non-trivial characteristic roots
of a stochastic or generalized stochastic matrix?  Are there
smaller bounds which will necessarily include all of these roots?
These questions are answered with the following theorem, stated
without the proof, which can be found in [2, d, p. 89].

Theorem (IV.4).  Assume that m is the smallest off-diagonal
element of the positive stochastic matrix $A = (a_{k\lambda})$ of order n
and that $a_{11}$ and $a_{22}$ are the smallest elements of the main
diagonal.  Then all the non-trivial characteristic roots lie in
or on the oval

    $|z - (a_{11} - m)| |z - (a_{22} - m)| \le \{1 - a_{11} - (n-1)m\}\{1 - a_{22} - (n-1)m\}.$

Note that this theorem will also be true if A is a stochastic
matrix rather than a positive stochastic matrix.  If some
off-diagonal element of A is zero, then m equals zero, and this
is the same bound we had from Theorem (IV.2).  For a positive
stochastic matrix, however, $m > 0$, and hence the oval from Theorem
(IV.4) is strictly smaller than the oval from Theorem (IV.2), and
the trivial root w = 1 will indeed lie outside this new oval.

If  we  extend  Theorem  (IV. 4)  to  generalized  stochastic 
matrices,  we  obtain: 

Theorem (IV.5).  Assume that m is the smallest off-diagonal
element of the positive generalized stochastic matrix $A = (a_{k\lambda})$
with row sum g, and that $a_{11}$ and $a_{22}$ are the smallest elements of
the main diagonal.  Then all of the non-trivial characteristic
roots lie in or on the oval

    $|z - (a_{11} - m)| |z - (a_{22} - m)| \le \{g - a_{11} - (n-1)m\}\{g - a_{22} - (n-1)m\}.$
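
The two preceding theorems can be tried out on a small example. The sketch below handles order 3 only, where the non-trivial roots follow from the trace and the principal 2 x 2 minors once the trivial root w = g is known. The function names and the matrix are hypothetical; setting g = 1 gives the case of Theorem (IV.4).

```python
import cmath


def roots_constant_row_sum_3x3(a, g):
    """Characteristic roots of a 3 x 3 matrix whose row sums all equal g.

    w = g is the trivial root.  With t = trace and e2 = sum of the principal
    2 x 2 minors, the other two roots have sum t - g and product e2 - g (t - g).
    """
    t = a[0][0] + a[1][1] + a[2][2]
    e2 = (a[0][0] * a[1][1] - a[0][1] * a[1][0]
          + a[0][0] * a[2][2] - a[0][2] * a[2][0]
          + a[1][1] * a[2][2] - a[1][2] * a[2][1])
    s = t - g
    p = e2 - g * s
    d = cmath.sqrt(s * s - 4.0 * p)
    return complex(g), [(s + d) / 2.0, (s - d) / 2.0]


def reduced_oval(a, g):
    """Foci and right-hand side of the oval in Theorem (IV.5)."""
    n = len(a)
    m = min(a[k][l] for k in range(n) for l in range(n) if k != l)
    a1, a2 = sorted(a[k][k] for k in range(n))[:2]
    rhs = (g - a1 - (n - 1) * m) * (g - a2 - (n - 1) * m)
    return a1 - m, a2 - m, rhs


# Hypothetical positive generalized stochastic matrix with row sum g = 2.
A = [[0.2, 1.6, 0.2],
     [0.2, 0.4, 1.4],
     [1.2, 0.2, 0.6]]
g = 2.0

trivial, others = roots_constant_row_sum_3x3(A, g)
f1, f2, rhs = reduced_oval(A, g)
for w in others:
    print(w, abs(w - f1) * abs(w - f2) <= rhs)                 # True: inside the oval
print(trivial, abs(trivial - f1) * abs(trivial - f2) <= rhs)   # False: w = g is outside
```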

Obviously,  the  above  five  theorems  on  stochastic  matrices 
comprise  merely  an  introduction  to  the  theory  of  the  localization 
of  their  characteristic  roots.   However,  these  bounds  do  serve 
as  rudiments  which  the  interested  reader  may  incorporate  into 
his  further  study  on  the  subject.   Such  a  study  of  localization 
theory  for  the  characteristic  roots  of  stochastic  matrices  may  be 
done  in  many  fields,  and  in  particular,  the  field  of  probability 
and  mathematical  statistics,  where  it  is  extremely  important  in 
applications.

V.   CONCLUSION

As  has  been  stated  earlier,  one  may  easily  construct  some  of 
the  elementary  bounds  for  the  characteristic  roots  of  a  matrix. 
However,  in  order  to  improve  upon  these  bounds,  one  must,  in 
general,  sacrifice  both  ease  of  computation  and  simplicity  of 
construction.   This  continues  until  the  geometric  interpretation 
of  a  bound  becomes  almost  impossible  to  visualize.   For  example, 
the  bounds  obtained  from  Theorem  (III.  4)  and  (III.  5)  are  of  this 
nature . 

The order of presentation of the bounds in this paper has
been determined by two factors.  These are the increasing
accuracy of the bound and the decreasing ease of computation
of the bound.  It is true that these factors coincide for
most of the bounds, as stated above.  Nevertheless, there are
instances in the paper where they differ.  The bound from Theorem
(II.5) is of this nature.  From Graph #1 it is seen that the disc
from Theorem (II.5) is larger than the disc from Theorem (II.3),
but it is much easier to calculate the maximum $S_k$ and the maximum
$T_\lambda$ and then use their product to obtain the radius of the disc
than it is to calculate the maximum of $S_k T_k$ over all k.

If  one  is  posed  with  a  practical  problem,  such  as  may  arise 
in  the  construction  of  a  bridge  or  other  solid  structure,  where 
an  upper  bound  on  the  characteristic  roots  of  a  particular  matrix 
is  desired,  which  bound  should  he  use?   The  answer  to  this  question 
depends  upon  many  factors ,  and  probably  the  most  important  factor 
is:  will the structure be built to meet only the minimum
requirements for strength and durability or will it be built to
withstand  a  stress  much  greater  than  the  maximum  stress  it  is 
expected  to  receive?   If  the  former  is  the  case,  one  would 
probably  want  a  very  accurate  bound,  and  thus,  the  ovals  of 
Cassini  from  Theorem  (III. 3)  may  suffice.   If  a  still  better 
bound  is  needed,  Theorem  (III. 4)  or  Theorem  (III. 5)  may  be  used. 
On  the  other  hand,  if  the  latter  is  the  case,  a  somewhat  rougher 
bound  may  be  sufficient.   In  this  case,  a  good  one  to  use  is  the 
disc  from  Theorem  (III.  2)  since  it  is  easy  to  apply.   Although 
this  bound  may  be  easier  to  apply,  it  may  not  be  a  sufficiently 
accurate  bound.   Hence,  one  may  choose  to  use  the  Gersgorin  discs 
from  Theorem  (III.l). 

Therefore,  depending  upon  the  desired  degree  of  accuracy 
needed  for  a  given  situation,  one  may  choose  the  bound  which  is 
best suited.

The theory on bounds for the characteristic roots of a
matrix may be classified as a branch of the theory of inequalities.
Both upper bounds and lower bounds can be found, but we shall
restrict our consideration to upper bounds only.  These bounds
are derived as geometric configurations.  Some are circles and
some are ovals.

The k-th row sum of a square matrix of order n is defined as

    $S_k = \sum_{\lambda=1}^{n} |a_{k\lambda}|,$

where k = 1, 2, ..., n.  Similarly, the $\lambda$-th column sum is defined
as

    $T_\lambda = \sum_{k=1}^{n} |a_{k\lambda}|,$

where $\lambda = 1, 2, \ldots, n$.

The first important circular bound established is that all
of the characteristic roots w of an arbitrary square matrix of
order n lie in or on the circle

    $|z| \le \{(\max_k S_k)(\max_\lambda T_\lambda)\}^{1/2}.$

This bound is both proved mathematically and depicted graphically.

Using this same notation for row sums and column sums, it
is then proved that all of the roots lie in or on the circle

    $|z| \le \max_k \frac{S_k + T_k}{2}.$
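
To make the two circular bounds concrete, the sketch below computes both radii from the row sums and column sums of a small matrix. The function names and the sample matrix are hypothetical and serve only as an illustration.

```python
import math


def row_sums(a):
    """S_k: sum of the absolute values along row k."""
    return [sum(abs(x) for x in row) for row in a]


def column_sums(a):
    """T_lambda: sum of the absolute values down column lambda."""
    return [sum(abs(row[l]) for row in a) for l in range(len(a))]


def radius_product(a):
    """Radius {(max S_k)(max T_lambda)}^(1/2) of the first circular bound."""
    return math.sqrt(max(row_sums(a)) * max(column_sums(a)))


def radius_mean(a):
    """Radius max_k (S_k + T_k)/2 of the second circular bound."""
    return max((s + t) / 2.0 for s, t in zip(row_sums(a), column_sums(a)))


# Hypothetical upper triangular matrix; its characteristic roots are the
# diagonal elements 1 and 3.
A = [[1, 2],
     [0, 3]]

print(radius_product(A))   # sqrt(3 * 5), about 3.87
print(radius_mean(A))      # 4.0
```

Both radii exceed the largest root, 3; for this matrix the first bound happens to be the tighter of the two.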


The k-th row radius, $P_k$, is defined as $P_k = S_k - |a_{kk}|$.
Similarly, the $\lambda$-th column radius, $Q_\lambda$, is defined as
$Q_\lambda = T_\lambda - |a_{\lambda\lambda}|$.

For a square matrix of order n, the n discs with centers at
$a_{kk}$ and radii of $P_k$ are called the Gersgorin discs for the rows
of the matrix.  If the radii are the $Q_k$, they are the Gersgorin
discs for the columns of the matrix.

It is proved that all of the characteristic roots of a
square matrix lie in or on both the union of the n Gersgorin
discs for the rows and also the union of the n Gersgorin discs
for the columns.  Hence, they lie in the intersection of these
two unions.
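
The row and column discs can be computed directly from their definitions. The sketch below builds both families for a small matrix with known roots and tests membership in each union; the function names and the matrix are hypothetical.

```python
def row_discs(a):
    """(center, radius) pairs with radius P_k = S_k - |a_kk|."""
    return [(row[k], sum(abs(x) for x in row) - abs(row[k]))
            for k, row in enumerate(a)]


def column_discs(a):
    """(center, radius) pairs with radius Q_k = T_k - |a_kk|."""
    n = len(a)
    return [(a[k][k], sum(abs(a[i][k]) for i in range(n)) - abs(a[k][k]))
            for k in range(n)]


def in_union(z, discs):
    """True if z lies in or on at least one of the discs."""
    return any(abs(z - c) <= r for c, r in discs)


# Hypothetical matrix with trace 7 and determinant 10, so its roots are 2 and 5.
A = [[4, 1],
     [2, 3]]

for w in (2, 5):
    print(w, in_union(w, row_discs(A)), in_union(w, column_discs(A)))
# prints "2 True True" and "5 True True"
```

In this example the root 2 lies in the second row disc and the first column disc but not in the first row disc, so a root need only belong to each union, not to every individual disc.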

If the row radii are taken in pairs, the bound established
is that all of the characteristic roots w lie in or on the
union of the n(n - 1)/2 ovals of Cassini

    $|z - a_{kk}| |z - a_{\lambda\lambda}| \le P_k P_\lambda.$

A similar set of n(n - 1)/2 ovals of Cassini for the columns is

    $|z - a_{kk}| |z - a_{\lambda\lambda}| \le Q_k Q_\lambda.$

Thus, all of the roots lie within the intersection of the two
unions formed from these n(n - 1) ovals.

These ovals are proved to be better than the Gersgorin discs.
The n(n - 1)/2 ovals for the rows are depicted graphically, and
a considerable improvement over the Gersgorin discs can be
observed.

By making use of several stated lemmas, it is proved that
all of the characteristic roots of a stochastic matrix lie in or
on the single oval

    $|z - a_{rr}| |z - a_{tt}| \le (1 - a_{rr})(1 - a_{tt}),$

where $a_{rr}$ and $a_{tt}$ are the two smallest diagonal elements.

A smaller oval than the above is defined in terms of the
two smallest diagonal elements, say $a_{11}$ and $a_{22}$, and the minimum
off-diagonal element, say m.  The result is that all of the
non-trivial characteristic roots of a positive stochastic matrix
of order n lie in or on the oval

    $|z - (a_{11} - m)| |z - (a_{22} - m)| \le \{1 - a_{11} - (n-1)m\}\{1 - a_{22} - (n-1)m\}.$

All of the results for stochastic matrices are also proved
for generalized stochastic matrices where 1 is replaced by g,
the constant row sum.


